POLS 2972Q

Quantitative Analysis in Political Science

Lecture 7 | Data Visualization

Plan for Today

  • Why visualizing data is important
  • Strategies for effective visualizations
  • In-class activity

Importance of Visualizing Data

  • Exploration vs. explanation
  • Get a glimpse at patterns and relationships within the data
  • When done well, creates value for stakeholders
  • When done poorly, spreads misinformation and leads to false conclusions and flawed knowledge

Ineffective Visualizations

  • Use of 3-D to model 1-D or 2-D relationships
  • Meaningless use of color
  • Lack of informative labeling/titles
  • Overly fancy charts
  • Cluttered visualizations
  • Pie charts (just…don’t)

Strategies for Effective Visualization

  • Just because you can create a particular type of graph or visualization doesn’t mean that you should
  • Start with questions:
    • What do I want to convey?
    • What is the appropriate visual tool?
    • How can I minimize the amount of clutter?
    • How can I draw attention where I want it?
    • Can I add information to make the story clear?

How to Evaluate a Visualization

  • Who is the intended audience?
  • What story does it tell?
    • What is the intent (explore or explain)?
  • Is the story compelling?
  • Was the right type of visual used?
  • What information is missing?
    • Implicit vs. explicit

Effective Visualization

  • Understand the context
  • Choose an appropriate visual display
  • Eliminate clutter
  • Focus attention where you want it
    • Aesthetics
  • Think like a designer
  • Tell a story

Telling Your Story

  • Know your audience
  • Make your graphics self-explanatory
    • Labels, titles, etc.
  • What interpretations/conclusions are you leaving your audience to make?
    • Are they correct?
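
One way to make a graphic self-explanatory in ggplot2, the package used later in this lecture, is the labs() function. The following is a minimal sketch using the built-in mpg data. The object name `labeled` and the title and axis wording are illustrative assumptions based on the variable descriptions in ?mpg, not part of the lecture.

```r
library(ggplot2)

# Scatterplot of engine size against highway fuel economy, with a title
# and axis labels that spell out what the raw variable names mean.
labeled <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  labs(
    title = "Engine size and highway fuel economy",
    x = "Engine displacement (liters)",
    y = "Highway miles per gallon"
  )

labeled  # printing the object draws the plot
```

Without labs(), the axes would show only the raw column names displ and hwy, which leaves the reader to guess what they measure.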

ggplot

  • Load the mpg data
library(tidyverse)

mpg
# A tibble: 234 × 11
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
 2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
 3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
 4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
 5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
 6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
 7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
 8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
 9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
# ℹ 224 more rows
  • What does the data look like?
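mpg <- mpg     # copy the built-in dataset into your environment

glimpse(mpg)   # compact overview: one line per column
View(mpg)      # open the data in a spreadsheet-style viewer
?mpg           # open the help page that describes each variable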
  • Create a plot that examines how the size of a car’s engine affects its fuel efficiency.
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy))

  • Recall

    • ggplot() is the plot function
      • Creates a coordinate system that you can add layers to
    • data = is where you specify the dataset that you are using
    • geom_point() is an added layer, which specifies how the data will be plotted
    • mapping defines how variables in your dataset are mapped to visual properties
    • aes() specifies which variables to map to the x and y axes
  • For example, ggplot() on its own draws an empty plot
ggplot(data = mpg)
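
The layering idea recalled above can be sketched by building the plot in two steps. The object names `base` and `scatter` are illustrative assumptions, not part of the lecture. The sketch loads ggplot2 directly, which is the tidyverse package that provides both ggplot() and the mpg data.

```r
library(ggplot2)

# Step 1: ggplot() alone creates an empty coordinate system (no layers yet).
base <- ggplot(data = mpg)

# Step 2: adding geom_point() puts one layer of points on top of it.
scatter <- base +
  geom_point(mapping = aes(x = displ, y = hwy))

length(base$layers)     # number of layers in the empty plot
length(scatter$layers)  # number of layers after adding geom_point()

scatter  # printing the object draws the plot
```

Storing the base plot in an object is optional. It simply makes it easier to see that each + adds a layer to what came before.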

Aesthetic Mapping

  • Aesthetics
    • Are visual properties of the objects in your plot
    • Include things like the size, shape, or color of your points
  • For example, let’s reexamine the plot from above, but break the data out by vehicle class (type)
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

  • Let’s make a few more aesthetic changes
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy, fill = class), shape = 24, size = 4)
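
The last plot mixes two ways of using aesthetics: fill is mapped to a variable inside aes(), while shape and size are set to fixed values outside it. The following is a minimal sketch of that distinction using color. The object names `mapped` and `fixed` and the choice of "blue" are illustrative assumptions.

```r
library(ggplot2)

# Mapped: color varies with the class variable, so it goes inside aes()
# and ggplot2 builds a legend for it.
mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Set: every point gets the same fixed color, so it goes outside aes()
# and no legend is drawn for it.
fixed <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

mapped
fixed
```

A common slip is writing color = "blue" inside aes(). ggplot2 then treats "blue" as a data value rather than a color name, so the points are not drawn in blue and a legend appears.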

Break